Inderjit S. Dhillon
(http://www.cs.utexas.edu/~inderjit/)
Tuesday 3rd July 2012
Time: ***11am***
B10 Seminar Room, Basement,
Alexandra House, 17 Queen Square, London, WC1N 3AR
Fast Coordinate Descent Methods with Variable Selection for Non-negative Matrix Factorization
Nonnegative Matrix Factorization (NMF) is an effective dimension reduction method for non-negative matrices,
and has proven to be useful in many areas, such as text mining, bioinformatics and image processing. In this
talk, I will present a new co-ordinate descent algorithm for solving the least squares NMF problem.
The
algorithm uses a variable selection scheme that employs the gradient of the objective function. This new
method is considerably faster in practice, especially when the solution is sparse, as is often the case
in real applications. In these cases, our method benefits by selecting important variables to update more
often, thus resulting in higher speed.
As an example, on a text dataset RCV1, our method is 7 times faster
than FastHals, which is a cyclic co-ordinate descent algorithm, and more than 15 times faster when the
sparsity is increased by adding an L1 penalty.
We also develop new coordinate descent methods when error in
NMF is measured by KL- divergence by applying the Newton method to solve the one-variable sub-problems.
Experiments indicate that our algorithm for minimizing the KL-divergence is faster than the Lee & Seung
multiplicative rule by a factor of 10 on the CBCL image dataset.
This is joint work with Cho-Jui Hsieh